File: USPT



Sep 12, 2000 



US-PAT-NO: 6119120 

DOCUMENT-IDENTIFIER: US 6119120 A

TITLE: Computer implemented methods for constructing a compressed data structure
from a data string and for using the data structure to find data patterns in the
data string

DATE-ISSUED: September 12, 2000



INVENTOR-INFORMATION:

NAME              CITY       STATE   ZIP CODE   COUNTRY
Miller; John W.   Kirkland   WA

ASSIGNEE-INFORMATION:

NAME                    CITY      STATE   ZIP CODE   COUNTRY   TYPE CODE
Microsoft Corporation   Redmond   WA                           02



APPL-NO: 8/ 673427 [PALM] 
DATE FILED: June 28, 1996 

INT-CL: [7] G06 F 17/30 

US-CL-ISSUED: 707/101; 707/6, 707/7, 707/3
US-CL-CURRENT: 707/101; 707/3, 707/6, 707/7

FIELD-OF-SEARCH: 382/229, 382/230, 382/231, 707/6, 707/3, 707/7, 707/101, 707/2
PRIOR-ART-DISCLOSED:

PAT-NO    ISSUE-DATE     PATENTEE-NAME    US-CL
5459739   October 1995   Handley et al.   371/136



OTHER PUBLICATIONS

"Dynamic Programming Alignment of Sequences Representing Cyclic Patterns", by Jens
Gregor and Michael G. Thomason, IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 15, No. 2, pp. 129-135, Feb. 1993.

"Searching Genetic Databases on Splash 2", by Dzung T. Hoang, Proceedings IEEE
Workshop on FPGAs for Custom Computing Machines (Cat. No. 93TH0535-5), pp. 185-191,
Apr. 5, 1993.

"Rapid-2, An Object-Oriented Association Memory Applicable to Genome Data
Processing", by Denis Archambaud, Pascal Faudemay, and Alain Greiner, Proceedings of
the Twenty-Seventh Annual Hawaii International Conference on System Sciences, pp.
150-159, Jan. 1994.

"A Faster Algorithm Computing String Edit Distances", William J. Masek and Michael
S. Paterson, Journal of Computer and System Sciences, 20, pp. 18-31, Aug. 6, 1979.

"Synthesis and Recognition of Sequences", by S.C. Chan and A.K.C. Wong, IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 13, No. 12, pp.
1245-1255, Dec. 1991.

"Efficient Systolic String Matching", by G.M. Megson, Electronic Letters, vol. 26,
No. 24, pp. 2040-2042, Nov. 1990.

ART-UNIT: 273

PRIMARY-EXAMINER: Au; Amelia
ASSISTANT-EXAMINER: Frederick, II; Gilberto
ATTY-AGENT-FIRM: Lee & Hayes, PLLC



ABSTRACT : 

A method for constructing a data structure for a data string of characters includes
producing a matrix of sorted rotations of the data string. This matrix defines an A
array, which is a sorted list of the characters in the data string; a B array, which
is a permutation of the data string; and a correspondence array C, which contains
correspondence entries linking the characters in the A array to the same characters
in the B array. A reduced A1 array is computed to identify each unique character in
the A array, and a reduced C array is computed to contain every s-th entry of
the C array. The B array is segmented into blocks of size s. During a search, the A1
and C arrays are used to index the B array to reconstruct any desired row from the
matrix of rotations. Through this representation, the matrix of rotations can
be used as a conventional sorted list for pattern matching or information retrieval
applications. A data structure containing only the A1, B, and C arrays has very little
memory overhead. The B array contains the same number of characters as the original
data string, and can be compressed in a blockwise manner to reduce its size. The A1
array has a fixed size equal to the size of the alphabet used to construct the data
string, and the C array has a variable size according to the relationship n/s, where n
is the number of characters in the data string and s is the size of the blocks of
the B array. Accordingly, the data structure enables a tradeoff between access speed
and memory overhead, the product of which is constant with respect to block size s.
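
The relationship between the sorted rotations and the A, B, and C arrays can be illustrated with a minimal Python sketch. It is not the patented implementation: it keeps the full C array, omits the reduced A1 and C arrays and the blockwise segmentation of B, and assumes that B is the last column of the rotation matrix. The function names are hypothetical.

```python
def build_arrays(s):
    """Build the A, B, and C arrays from the sorted rotations of s.

    Assumption: ties between identical rotations are broken by start position.
    """
    n = len(s)
    # starts[i] is the start position of the rotation in sorted row i.
    starts = sorted(range(n), key=lambda p: s[p:] + s[:p])
    row_of = {p: i for i, p in enumerate(starts)}
    A = [s[p] for p in starts]          # first column: sorted characters
    B = [s[p - 1] for p in starts]      # last column: a permutation of s
    # C[i] is the row whose B entry is the same character instance as A[i].
    C = [row_of[(p + 1) % n] for p in starts]
    return A, B, C


def reconstruct_row(A, C, i, length):
    """Rebuild the first `length` characters of sorted row i by following C."""
    out = []
    for _ in range(length):
        out.append(A[i])
        i = C[i]
    return "".join(out)


A, B, C = build_arrays("banana$")
print("".join(A), "".join(B), reconstruct_row(A, C, 4, 7))
```

Because every row can be rebuilt on demand, a binary search over row indices can treat the matrix as a sorted list without storing it.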
File: USPT



Jan 4, 2000 



US-PAT-NO: 6012054 

DOCUMENT-IDENTIFIER: US 6012054 A

TITLE: Database system with methods for performing cost-based estimates using spline
histograms

DATE-ISSUED: January 4, 2000



INVENTOR-INFORMATION:

NAME                     CITY      STATE   ZIP CODE   COUNTRY
Seputis; Edwin Anthony   Oakland   CA

ASSIGNEE-INFORMATION:

NAME           CITY         STATE   ZIP CODE   COUNTRY   TYPE CODE
Sybase, Inc.   Emeryville   CA                           02



APPL-NO: 8/ 956631 [PALM] 
DATE FILED: October 23, 1997 

PARENT-CASE:

RELATED APPLICATIONS The present application claims the benefit of priority from
commonly-owned provisional application Ser. No. 60/057,408, filed Aug. 29, 1997 and
now pending, entitled DATABASE SYSTEM WITH METHODS FOR PERFORMING COST-BASED
ESTIMATES USING SPLINE HISTOGRAMS, the disclosure of which is hereby incorporated by
reference.

INT-CL: [6] G06 F 17/30 

US-CL-ISSUED: 707/3; 707/1, 707/2, 704/267, 704/258, 704/260, 395/500.02, 
395/500.03, 395/500.23, 364/474.29, 364/474.31, 364/468.03, 364/474.02 
US-CL-CURRENT: 707/3; 700/146, 700/187, 700/189, 700/97, 703/2, 704/258, 704/260,
704/267, 707/1, 707/2, 716/1

FIELD-OF-SEARCH: 707/3, 707/1, 707/2, 364/474.29, 364/474.31, 364/193, 364/474.02,
364/167.09, 364/468.03, 704/267, 704/258, 704/260, 395/500.03, 395/500.02, 
395/500.23 
                          Gibbons et al.
                          Phillips et al.
                          Schiefer et al.   707/2
5799311
                          Agrawal et al.    707/102
5822456   October 1998    Reed et al.
5838579   November 1998   Olson et al.      364/488
5903476   May 1999        Mauskar et al.    395/500.27



OTHER PUBLICATIONS 

Poosala, V., Ioannidis, Y., Haas, P., and Shekita, E., "Improved Histograms for
Selectivity Estimation of Range Predicates," ACM SIGMOD '96, Montreal, Canada, 1996,
pp. 294-305.

Piatetsky-Shapiro, G. and Connell, C., "Accurate Estimation of the Number of Tuples
Satisfying A Condition," ACM, 1984, pp. 256-276.

Mannino, M., Chu, P., and Sager, T., "Statistical Profile Estimation in Database 
Systems," ACM Computing Surveys, vol. 20, No. 3, Sep. 1988, pp. 191-221. 



ART-UNIT: 277 

PRIMARY-EXAMINER: Fetting; Anton W. 
ASSISTANT-EXAMINER: Corrielus; Jean M. 
ATTY-AGENT-FIRM: Smart; John A.



ABSTRACT : 

Database system and methods are described for improving execution speed of database
queries (e.g., for decision support) by providing methods employing spline histograms
for improving the determination of selectivity estimates. The general approach
improves histogram-based cost estimates as follows. The constant associated with a
predicate (e.g., in r.a>5, the constant is "5") is used to do a binary search in an
array of histogram boundary values, for determining a particular histogram cell.
Once a cell has been found, the system employs interpolation to find out how much of
the cell has been selected. Once this interpolation value is found, it is used with
a cell weighting and a spline value or weighting to estimate the selectivity of the
predicate value, which takes into account how data values are distributed within the
cell. As a result of increased accuracy of estimates, the system can formulate
better query plans and, thus, provide better performance.
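
The binary search, interpolation, and weighting steps can be sketched in Python as follows. The abstract does not give the spline formula, so this sketch assumes a hypothetical per-cell slope t in [-1, 1] describing a linear density within the cell (t = 0 is the uniform case). The function name, parameters, and density model are illustrative assumptions only.

```python
from bisect import bisect_right


def estimate_gt_selectivity(boundaries, weights, slopes, constant):
    """Estimate the fraction of rows satisfying `column > constant`.

    boundaries: sorted cell boundary values (len(weights) + 1 entries)
    weights:    fraction of all rows falling in each cell
    slopes:     ASSUMED spline parameter per cell, in [-1, 1]
    """
    if constant < boundaries[0]:
        return 1.0
    if constant >= boundaries[-1]:
        return 0.0
    # Binary search for the cell containing the constant.
    k = bisect_right(boundaries, constant) - 1
    lo, hi = boundaries[k], boundaries[k + 1]
    # Linear interpolation: position of the constant within the cell.
    u = (constant - lo) / (hi - lo)
    # Assumed density 1 + t*(2u - 1) gives cumulative fraction u + t*(u*u - u).
    t = slopes[k]
    below = u + t * (u * u - u)
    # Selected part of this cell plus every cell entirely above it.
    return weights[k] * (1.0 - below) + sum(weights[k + 1:])


print(estimate_gt_selectivity([0, 10, 20], [0.5, 0.5], [0.0, 0.0], 5))
```

With a slope of 0 the estimate reduces to ordinary linear interpolation within the cell; a nonzero slope shifts the estimate toward the end of the cell where values are assumed to be denser.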
File: USPT



Jun 30, 1998 



US-PAT-NO: 5774588

DOCUMENT-IDENTIFIER: US 5774588 A

TITLE: Method and system for comparing strings with entries of a lexicon 
DATE-ISSUED: June 30, 1998 

INVENTOR-INFORMATION:

NAME        CITY     STATE   ZIP CODE   COUNTRY
Li; Liang   Monroe   CT

ASSIGNEE-INFORMATION:

NAME                                     CITY      STATE   ZIP CODE   COUNTRY   TYPE CODE
United Parcel Service of America, Inc.   Atlanta   GA                           02

APPL-NO: 8/ 477481 [PALM] 
DATE FILED: June 7, 1995 

INT-CL: [6] G06 K 9/36, G06 K 9/72 

US-CL-ISSUED: 382/230; 382/229
US-CL-CURRENT: 382/230; 382/229

FIELD-OF-SEARCH: 382/229, 382/230, 382/231
                            Egami et al.
                            Nagasawa et al.
          February 1990     Itoh et al.         364/419
4979227   December 1990     Mittelbach et al.   382/229
5050218   September 1991    Ikeda et al.        382/100
5062143   October 1991      Schmitt             382/229
5133023   July 1992         Bokser              382/229
5136289   August 1992       Yoshida et al.      341/67
5261009   November 1993     Bokser              382/229
5276741   January 1994      Aragon
5325444   June 1994         Cass et al.
5329609   July 1994         Sanada et al.       395/2.6








FOREIGN PATENT DOCUMENTS

FOREIGN-PAT-NO   PUBN-DATE       COUNTRY   US-CL
0 518 496        December 1992   EPX







OTHER PUBLICATIONS 

William B. Cavnar and Alan J. Vayda, Using Superimposing Coding of N-gram Lists for
Efficient Inexact Matching, Environmental Research Institute of Michigan, pp.
253-26-7, 480-493.

Owolabi et al., "Fast Approximate String Matching," Software--Practice and
Experience, vol. 18, No. 4, pp. 387-393 (Apr. 1988).

Takahashi et al., "A Spelling Correction Method and Its Application to an OCR
System," Pattern Recognition, vol. 23, No. 3/4, pp. 363-377 (Jan. 1990).

Zobel et al., "Finding Approximate Matches in Large Lexicons," Software--Practice
and Experience, vol. 25, No. 3, pp. 331-345 (Mar. 1995).

William J. Masek and Michael S. Paterson, "A Faster Algorithm Computing String Edit
Distances," Journal of Computer and System Sciences, 20, 18-31 (1980), pp. 18-31.

Roy Lowrance and Robert A. Wagner, "An Extension of the String-to-String Correction
Problem," Journal of the Association for Computing Machinery, vol. 22, No. 2, Apr.
1975, pp. 177-183.

Robert A. Wagner and Michael J. Fischer, "The String-to-String Correction Problem,"
Journal of Association for Computing Machinery, vol. 21, No. 1, Jan. 1974, pp.
168-173.

Sun Wu and Udi Manber, "AGREP--A Fast Approximate Pattern-Matching Tool," Dept. of
Computer Science, University of Arizona.

Edward M. Riseman, "A Contextual Postprocessing System For Error Correction Using
Binary N-Grams," IEEE Transactions On Computers, vol. C-23, No. 5, May 1974, pp. 480,
481-493.



ART-UNIT: 266

PRIMARY-EXAMINER: Johns; Andrew
ASSISTANT-EXAMINER: Davis; Monica S.
ATTY-AGENT-FIRM: Jones & Askew, LLP



ABSTRACT : 
A system and method for more efficiently comparing an unverified string to a
lexicon, which filters the lexicon through multiple steps to reduce the number of 
entries to be directly compared with the unverified string. The method begins by 
preparing the lexicon with an n-gram encoding, partitioning and hashing process, 
which can be accomplished in advance of any processing of unverified strings. The 
unknown is compared first by partitioning and hashing it in the same way to reduce 
the lexicon in a computationally inexpensive manner. This is followed by an encoded 
vector comparison step, and finally by a direct string comparison step, which is the 
most computationally expensive. The reduction of the lexicon is accomplished without 
arbitrarily eliminating any large portions of the lexicon that might contain 
relevant candidates. At the same time, the method avoids the need to compare the 
unverified string directly or indirectly with all the entries in the lexicon. The 
final candidate list includes only highly possible and ranked candidates for the 
unverified string, and the size of the final list is adjustable. 
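
The staged filtering described in the abstract can be sketched in Python as follows. The abstract does not specify the encoding, partitioning, or comparison details, so every concrete choice here is an assumption made for illustration: partitioning by word length, a 64-bit bigram signature hashed with CRC-32 as the encoded vector, and Levenshtein distance as the direct string comparison. All function and parameter names are hypothetical.

```python
import zlib


def bigram_signature(word, bits=64):
    """Encode a word's bigrams as a bit vector (assumed encoding)."""
    padded = f"^{word}$"
    sig = 0
    for i in range(len(padded) - 1):
        sig |= 1 << (zlib.crc32(padded[i:i + 2].encode()) % bits)
    return sig


def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def build_index(lexicon):
    """Prepare the lexicon in advance: partition by length, store signatures."""
    index = {}
    for word in lexicon:
        index.setdefault(len(word), []).append((word, bigram_signature(word)))
    return index


def candidates(index, query, length_slack=1, min_shared=1, top_k=3):
    """Cheap partition lookup, then signature filter, then direct comparison."""
    qsig = bigram_signature(query)
    survivors = []
    for length in range(len(query) - length_slack, len(query) + length_slack + 1):
        for word, sig in index.get(length, []):
            if bin(sig & qsig).count("1") >= min_shared:
                survivors.append(word)
    # Only the survivors pay for the expensive edit-distance computation.
    ranked = sorted(survivors, key=lambda w: (edit_distance(query, w), w))
    return ranked[:top_k]


index = build_index(["atlanta", "atlantic", "monroe", "redmond"])
print(candidates(index, "atlanra"))
```

The `top_k` parameter plays the role of the adjustable final list size; raising `min_shared` makes the signature filter stricter at the risk of discarding relevant candidates.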

17 Claims, 8 Drawing figures 
File: USPT



Jul 1, 1997 



US-PAT-NO: 5644657 

DOCUMENT-IDENTIFIER: US 5644657 A

TITLE: Method for locating and displaying information in a pointer-based computer
system

DATE-ISSUED: July 1, 1997



INVENTOR-INFORMATION:

NAME                CITY         STATE   ZIP CODE   COUNTRY
Capps; Stephen P.   San Carlos   CA
Meier; John R.      Cupertino    CA

ASSIGNEE-INFORMATION:

NAME                   CITY        STATE   ZIP CODE   COUNTRY   TYPE CODE
Apple Computer, Inc.   Cupertino   CA                           02



APPL-NO: 8/ 456747 [PALM] 
DATE FILED: June 1, 1995 

PARENT-CASE:

This application is a continuation of a co-pending application Ser. No. 08/001,121, 
filed Jan. 5, 1993 which in turn is a continuation-in-part of application Ser. No. 
07/889,660, filed May 27, 1992, and both of which are assigned to the assignee of 
the present application, and both of which are hereby incorporated by reference in 
their entirety. 

INT-CL: [6] G06 K 9/72 

US-CL-ISSUED: 382/229 
US-CL-CURRENT: 382/229

FIELD-OF-SEARCH: 382/181, 382/182, 382/187, 382/199, 382/228, 382/229, 382/309,
382/155, 382/317, 345/121, 395/144-148, 395/155, 395/161
                           Suganuma et al.
4553261
                           Froessl             382/309
4797946   January 1989     Katsuta et al.      382/317
5038382   August 1991      Lipscomb
5157737   October 1992     Sklarew             382/187
5165012   November 1992    Crandall et al.     375/100
5172245   December 1992    Kita et al.         358/403
5179652   January 1993     Rozmanith et al.
5191622   March 1993       Shojima et al.      382/187
5317647   May 1994         Pagallo             382/155
5367453   November 1994    Capps et al.        382/226
5434929   July 1995        Beernink et al.     382/187
5452371   September 1995   Bozinovic et al.    382/187
5463696   October 1995     Beernink et al.     382/186
5479596   December 1995    Capps et al.        395/148
5500937   March 1996       Thompson-Rohrlich   395/161
5528743   June 1996        Tou et al.          395/148


OTHER PUBLICATIONS 

O'Connor, Rory J., "Apple Banking on Newton's Brain", San Jose Mercury News,
Wednesday, Apr. 22, 1992, makes conjectures concerning anticipated features of an
unreleased pen-based computer.

Weiman, Liza and Moran, Tom, "A Step toward the Future", Macworld, Aug. 1992, pp. 
129-131. 

Soviero, Marcelle M., "Your World According to Newton", Popular Science, Sep. 1992, 
pp. 45-49. 

Abatemarco, Fred, "From the Editor", Popular Science, Sep. 1992, p. 4. 

A brochure describing the "PenBook" from Slate Corporation discusses one type of
book reading system. It is believed that the PenBook system was released in about
1991.

Macintosh User's Guide, Apple Computer, Inc., 1991, pp. 114-117.

ART-UNIT: 266

PRIMARY-EXAMINER: Couso; Jose L.

ATTY-AGENT-FIRM: Hickman Beyer & Weaver



ABSTRACT : 

A user interface is disclosed that facilitates easy find and display operations that 
search through the memory of a pointer based computing system. The user interface 
includes searching methods that are particularly well suited for use in a computer 
system in which the contents of the memory are divided into a plurality of 
searchable application files that are each capable of containing a plurality of 
records. In one aspect of the invention an improved find dialog box is disclosed. In 
another aspect, a method of selecting local versus global searches together with a
method of conducting the chosen search and processing user inputs in response to the 
search results is disclosed. Additionally, an improved interface for displaying the 
results of various searches is described. 

39 Claims, 15 Drawing figures 




L8: Entry 54 of 79 



File: USPT 



Dec 8, 1998 



US-PAT-NO: 5848408

DOCUMENT-IDENTIFIER: US 5848408 A
TITLE: Method for executing star queries
DATE-ISSUED: December 8, 1998



INVENTOR-INFORMATION:

NAME                     CITY            STATE   ZIP CODE   COUNTRY
Jakobsson; Hakan         San Francisco   CA
Ozbutun; Cetin           San Carlos      CA
Waddington; William H.   Foster City     CA

ASSIGNEE-INFORMATION:

NAME                 CITY             STATE   ZIP CODE
Oracle Corporation   Redwood Shores   CA

APPL-NO: 8/ 808621 [PALM] 
DATE FILED: February 28, 1997 

INT-CL: [6] G06 F 17/30 

US-CL-ISSUED: 707/3; 707/2 
US-CL-CURRENT: 707/3; 707/2

FIELD-OF-SEARCH: 707/2, 707/3, 707/4, 707/5
U.S. PATENT DOCUMENTS

COUNTRY TYPE CODE
02

          ISSUE-DATE       PATENTEE-NAME   US-CL
5249262   September 1993   Baule
PRIMARY-EXAMINER: Black; Thomas G.
ASSISTANT-EXAMINER: Wallace, Jr.; Michael J.
ATTY-AGENT-FIRM: McDermott, Will & Emery



ABSTRACT: 

A method and apparatus for processing star queries is provided. According to the 
method, a star query is transformed by adding to the star query subqueries that are 
not in the query. The subqueries are generated based on join predicates and 
constraints on dimension tables that are contained in the original query. The 
subqueries are executed, and the values returned by the subqueries are used to 
access one or more bitmap indexes built on columns of the fact table. The bitmaps 
retrieved for the values returned by each subquery are merged to create one subquery 
bitmap per subquery. An AND operation is performed on the subquery bitmaps, and the 
resulting bitmap is used to determine which data to retrieve from the fact table. 
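
The bitmap steps in this abstract can be sketched in Python as follows. This is an illustrative sketch rather than the patented implementation: bitmaps are modelled as Python integers with one bit per fact-table row, the subquery results are supplied directly as sets of key values instead of being generated from a query, and all names (including the `store_id` and `product_id` columns) are hypothetical.

```python
def build_bitmap_index(fact_rows, column):
    """Map each distinct column value to a bitmap of the rows containing it."""
    index = {}
    for rowid, row in enumerate(fact_rows):
        index[row[column]] = index.get(row[column], 0) | (1 << rowid)
    return index


def merge_bitmaps(index, keys):
    """OR together the bitmaps for every value returned by one subquery."""
    merged = 0
    for key in keys:
        merged |= index.get(key, 0)
    return merged


def star_filter(fact_rows, indexes, subquery_results):
    """AND the per-subquery bitmaps and fetch the matching fact rows."""
    result = (1 << len(fact_rows)) - 1  # start with every row selected
    for column, keys in subquery_results.items():
        result &= merge_bitmaps(indexes[column], keys)
    return [row for rowid, row in enumerate(fact_rows) if (result >> rowid) & 1]


fact = [
    {"store_id": 1, "product_id": 10, "amount": 5},
    {"store_id": 2, "product_id": 10, "amount": 7},
    {"store_id": 1, "product_id": 11, "amount": 3},
]
indexes = {c: build_bitmap_index(fact, c) for c in ("store_id", "product_id")}
# Pretend the dimension subqueries returned these key sets.
print(star_filter(fact, indexes, {"store_id": {1}, "product_id": {10}}))
```

Only the rows whose bits survive the final AND are read from the fact table, which is the point of deferring fact-table access until the bitmaps have been combined.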
PRIMARY-EXAMINER: Black; Thomas G.
ASSISTANT-EXAMINER: Homere; Jean R.
ATTY-AGENT-FIRM: Seed and Berry LLP



ABSTRACT: 



A method and system for efficiently performing database table aggregation is 
provided. In a preferred embodiment, an aggregation facility efficiently aggregates 
a source table using indices on an aggregated column of the source table and a 
grouping column of the source table. The facility uses the index on the aggregated 
column to identify the contents of the aggregated column in each row of the source 
table. The facility further uses information derived from the index on the grouping 
column to identify the contents of the grouping column in each row of the source 
table. For each row of the source table, the facility aggregates the identified 
aggregated column contents into a result value for the identified grouping column 
contents. In a further preferred embodiment, the facility generates a relation 
mapping from source table row to grouping column, which the facility uses to 
identify the contents of the grouping column in each row of the source table. In a 
further preferred embodiment, the facility may be used to perform multiple- level 
aggregations, as well as aggregations in which there are multiple grouping columns, 
multiple aggregated columns, and/or multiple result columns. 
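
The index-driven aggregation described above can be sketched in Python as follows. The sketch assumes that each index maps a column value to the list of row identifiers holding that value and that the aggregate is a SUM; neither detail is stated in the abstract, and the function names are hypothetical.

```python
from collections import defaultdict


def row_to_group_mapping(group_index):
    """Derive the row -> grouping value relation from the grouping-column index."""
    return {rowid: group for group, rowids in group_index.items() for rowid in rowids}


def aggregate_sum(agg_index, group_index):
    """Sum the aggregated column per group using only the two indices."""
    row_to_group = row_to_group_mapping(group_index)
    totals = defaultdict(int)
    # Walk the aggregated-column index; the source table itself is never read.
    for value, rowids in agg_index.items():
        for rowid in rowids:
            totals[row_to_group[rowid]] += value
    return dict(totals)


group_index = {"east": [0, 2], "west": [1, 3]}   # region -> row ids
agg_index = {10: [0, 1], 5: [2], 7: [3]}         # amount -> row ids
print(aggregate_sum(agg_index, group_index))
```

The `row_to_group_mapping` helper corresponds to the relation mapping from source table row to grouping column mentioned in the further preferred embodiment.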
ABSTRACT :

A new type of text search apparatus, capable of finding all occurrence positions of 
a search string that is an arbitrary character string, within a text which is 
written as a continuous sequence of characters, utilizes for text position reference
purposes in an index file, words which each occur (at least once within the text) as
the maximum length word, referred to as an extension word, among a set of 
arbitrarily predefined dictionary words extending from a specific character 
position. Each such occurrence of a word as an extension word defines one of a set 
of text position elements, with that set covering all of the character positions of 
the text. The index file also includes a table which relates each of the extension 
words to the respective positions at which each of the partial character strings of 
the word occur within the word. Each occurrence of an arbitrary search string within 
the text can thereby be expressed as either a partial character string within a 
single text position element, or as a sequence of partial character strings within a 
set of sequentially occurring text position elements, so that all such occurrences 
can be found by utilizing the index file. 
ASSISTANT-EXAMINER: Harrity; John



ABSTRACT : 



An FSM data structure is encoded by generating a transition unit of data 
corresponding to each transition which leads ultimately to a final state of the FSM. 
Information about the states is included in the transition units, so that the 
encoded data structure can be written without state units of data. The incoming 
transition units to a final state each contain an indication of finality. The 
incoming transition units to a state which has no outgoing transition units each 
contain a branch ending indication. The outgoing transition units of each state are 
ordered into a comparison sequence for comparison with a received element, and all 
but the last outgoing transition unit contain an alternative indication of a 
subsequent alternative outgoing transition. The indications are incorporated with 
the label of each transition unit into a single byte, and the remaining byte values 
are allocated among a number of pointer data units, some of which begin full length 
pointers and some of which begin pointer indexes to tables where pointers are 
entered. The pointers may be used where a state has a large number of incoming 
transitions or where the block of transition units depending from a state is broken 
down to speed access. The first outgoing transition unit of a state is positioned 
immediately after one of the incoming transitions so that it may be found without a 
pointer. Each alternative outgoing transition unit is stored immediately after the 
block beginning with the previous outgoing transition unit so that it may be found 
by proceeding through the transition units until the number of alternative bits and 
the number of branch ending bits balance. 